Building multi-way decision trees with numerical attributes

Authors

  • Fernando Berzal Galiano
  • Juan C. Cubero
  • Nicolás Marín
  • Daniel Sánchez
Abstract

Decision trees are probably the most popular and commonly used classification model. They are built recursively following a top-down approach (from general concepts to particular examples) by repeatedly splitting the training dataset. When this dataset contains numerical attributes, binary splits are usually performed by choosing the threshold value that minimizes the impurity measure used as the splitting criterion (e.g. the C4.5 gain ratio criterion or the CART Gini index). In this paper we propose the use of multi-way splits for continuous attributes in order to reduce tree complexity without decreasing classification accuracy. This can be done by intertwining a hierarchical clustering algorithm with the usual greedy decision tree learning. © 2003 Elsevier Inc. All rights reserved.
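The two kinds of split mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names (`best_binary_threshold`, `multiway_cut_points`) are hypothetical, the binary split uses weighted Gini impurity, and the multi-way split assumes a simple agglomerative scheme that repeatedly merges the two adjacent intervals with the most similar class distributions (squared Euclidean distance) until a caller-chosen number of intervals `k` remains. Both the distance measure and the stopping rule are assumptions made for illustration only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold fits between two equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best


def _distribution(labels, classes):
    """Relative class frequencies, in the fixed order given by `classes`."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in classes]


def _distance(a_labels, b_labels, classes):
    """Squared Euclidean distance between two class distributions (assumed measure)."""
    p = _distribution(a_labels, classes)
    q = _distribution(b_labels, classes)
    return sum((a - b) ** 2 for a, b in zip(p, q))


def multiway_cut_points(values, labels, k):
    """Merge adjacent intervals bottom-up until k remain; return the cut points."""
    classes = sorted(set(labels))
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    # Start with one interval per distinct attribute value.
    groups = []
    for v, lab in pairs:
        if groups and groups[-1][0][-1] == v:
            groups[-1][0].append(v)
            groups[-1][1].append(lab)
        else:
            groups.append(([v], [lab]))
    while len(groups) > k:
        # Pick the adjacent pair whose class distributions are closest.
        i = min(
            range(len(groups) - 1),
            key=lambda j: _distance(groups[j][1], groups[j + 1][1], classes),
        )
        merged = (groups[i][0] + groups[i + 1][0], groups[i][1] + groups[i + 1][1])
        groups[i:i + 2] = [merged]
    # Place each cut point midway between neighbouring intervals.
    return [(g[0][-1] + h[0][0]) / 2 for g, h in zip(groups, groups[1:])]


if __name__ == "__main__":
    xs = [1, 2, 3, 10, 11, 12, 20, 21, 22]
    ys = list("aaabbbccc")
    print(best_binary_threshold(xs[:6], ys[:6]))
    print(multiway_cut_points(xs, ys, 3))
```

With three well-separated classes, a single binary threshold can isolate only one class per node, whereas the multi-way sketch yields two cut points and therefore three branches at one node, which is the kind of reduction in tree complexity the abstract refers to.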


Similar resources

A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...


A New Sampling Strategy for Building Decision Trees from Large Databases

We propose a fast and efficient sampling strategy to build decision trees from a very large database, even when there are many numerical attributes which must be discretized at each step. Successive samples are used, one on each tree node. Applying the method to a simulated database (virtually infinite size) confirms that when the database is large and contains many numerical attributes, our strate...


Building a maintenance policy through a multi-criterion decision-making model

A major competitive advantage of production and service systems is establishing a proper maintenance policy. Therefore, maintenance managers should make maintenance decisions that best fit their systems. Multi-criterion decision-making methods can take into account a number of aspects associated with the competitiveness factors of a system. This paper presents a multi-criterio...


Multi-criteria Decision Making Approach: selection of Blanking Die Material (TECHNICAL NOTE)

Proper selection of material in manufacturing firms is a vital role of designer depending upon the different era of application. The material selection problem is very complex and challenging task today. Erroneous cull of material frequently leads to astronomically immense cost involution, and finally drives towards unfortunate component or product breakdown. Thus, the designer necessitates dis...


Finding Optimal Multi-Splits for Numerical Attributes in Decision Tree Learning

Handling continuous attribute ranges remains a deficiency of top-down induction of decision trees. They require special treatment and do not fit the learning scheme as well as one could hope for. Nevertheless, they are common in practical tasks and, therefore, need to be taken into account. This topic has attracted abundant attention in recent years. In particular, Fayyad and Irani showed how opt...



Journal title:
  • Inf. Sci.

Volume 165

Publication date: 2004